Skip to content

Conversation

Pouyanpi
Copy link
Collaborator

@Pouyanpi Pouyanpi commented Sep 30, 2025

Stack Info

This PR is part of a stack:

  1. refactor!: drop reasoning trace extraction logic #1427 (this PR)
  2. feat: add reasoning trace extraction from llm calls #1431
  3. feat: emit BotThinking events with reasoning traces #1432
  4. feat(logging): improve event logging and add bot thinking display #1434

⚠️ Review order: Start here, then proceed through #1431#1432#1434

Description

Replaces custom reasoning trace extraction with LangChain's built-in additional_kwargs approach, removing ~1,500 lines of code.

Breaking Changes

  • reasoning_config (including start_token, end_token, remove_reasoning_traces) removed from model configuration
  • apply_to_reasoning_traces removed from output rails configuration

Migration

Remove reasoning_config sections from config files. Reasoning traces are now automatically extracted by LangChain for supported models (e.g., DeepSeek-R1).

  • TODO:
  • deprecate reasoning_config from config gracefully

Changes

  • Removed: ReasoningModelConfig, ParsedTaskOutput, token extraction logic
  • Added: Automatic extraction from response.additional_kwargs['reasoning_content']
  • Deleted: 4 test files (~1,085 lines)

@Pouyanpi Pouyanpi marked this pull request as draft September 30, 2025 15:28
…al_kwargs

BREAKING CHANGE: Remove reasoning_config and apply_to_reasoning_traces
config options
  - Remove ReasoningModelConfig, ParsedTaskOutput, and related
extraction logic
  - Replace with automatic extraction from
response.additional_kwargs['reasoning_content']
  - Remove 1,085 lines of test code and related infrastructure
  - Simplify parse_task_output() to return str directly
  - Update DeepSeek-R1 and Nemotron example configs

Reasoning traces are now automatically extracted by LangChain and stored
in additional_kwargs, eliminating the need for token-based parsing and
complex configuration options.
@Pouyanpi Pouyanpi force-pushed the refactor/reasoning-traces-langchain-integration branch from c4d3366 to 8aee801 Compare September 30, 2025 15:47
@Pouyanpi Pouyanpi changed the title refactor!: replace reasoning trace extraction with LangChain addition… refactor!: drop reasoning trace extraction logic Sep 30, 2025
@codecov-commenter
Copy link

Codecov Report

❌ Patch coverage is 62.50000% with 3 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
nemoguardrails/actions/llm/utils.py 50.00% 3 Missing ⚠️

📢 Thoughts on this report? Let us know!

Copy link
Collaborator

@tgasser-nv tgasser-nv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Just want to check we cover lines 201-3 of actions/llm/utils.py in another of the PRs in the stack

@trebedea
Copy link
Member

trebedea commented Oct 7, 2025

@Pouyanpi , shouldn't we keep rails.output.apply_to_reasoning_traces functionality?
Or you are adding it again in one of the follow-up MRs?

@Pouyanpi
Copy link
Collaborator Author

Pouyanpi commented Oct 7, 2025

@Pouyanpi , shouldn't we keep rails.output.apply_to_reasoning_traces functionality? Or you are adding it again in one of the follow-up MRs?

Not anymore, now if someone wants to use a specific output rails that checks the reasoning content should include it in the prompt template (see https://github.com/NVIDIA-NeMo/Guardrails/pull/1432/files#diff-1eb0f307d462ecfbe7710b361f4d296d3648a44bfcf85fda895c6942b2e4606dR37-R39 ) and use a rail that supports guardrailing reasoning contents ( for example see how actions.py is modified)

There needs to be a subsequent PR that updates some of the actions.

@trebedea
Copy link
Member

trebedea commented Oct 7, 2025

Not anymore, now if someone wants to use a specific output rails that checks the reasoning content should include it in the prompt template (see https://github.com/NVIDIA-NeMo/Guardrails/pull/1432/files#diff-1eb0f307d462ecfbe7710b361f4d296d3648a44bfcf85fda895c6942b2e4606dR37-R39 ) and use a rail that supports guardrailing reasoning contents ( for example see how actions.py is modified)

There needs to be a subsequent PR that updates some of the actions.

I don't have anything against it, but for me it seemed like a useful option to be able to run any output rails on response only vs (thinking + response) by only adding a config flag. I agree that with the current implementation even more complex output rails can be defined way easier, but maybe it's not a bad idea to have the flag anyway.

@Pouyanpi
Copy link
Collaborator Author

I don't have anything against it, but for me it seemed like a useful option to be able to run any output rails on response only vs (thinking + response) by only adding a config flag. I agree that with the current implementation even more complex output rails can be defined way easier, but maybe it's not a bad idea to have the flag anyway.

I understand your point. As of introducing these PRs we are treating bot thinking and thus reasoning traces as a distinct entities. I'll open a PR that addresses this, we can keep it for sake of backward compatibility until we have proper rails in place.

@Pouyanpi Pouyanpi merged commit 785852d into develop Oct 13, 2025
31 checks passed
@Pouyanpi Pouyanpi deleted the refactor/reasoning-traces-langchain-integration branch October 13, 2025 07:51
tgasser-nv pushed a commit that referenced this pull request Oct 14, 2025
…al_kwargs (#1427)

BREAKING CHANGE: Remove reasoning_config and apply_to_reasoning_traces
config options
  - Remove ReasoningModelConfig, ParsedTaskOutput, and related
extraction logic
  - Replace with automatic extraction from
response.additional_kwargs['reasoning_content']
  - Remove 1,085 lines of test code and related infrastructure
  - Simplify parse_task_output() to return str directly
  - Update DeepSeek-R1 and Nemotron example configs

Reasoning traces are now automatically extracted by LangChain and stored
in additional_kwargs, eliminating the need for token-based parsing and
complex configuration options (this is valid for nvidia_ai_endpoints and deepseek engines).
tgasser-nv pushed a commit that referenced this pull request Oct 14, 2025
…al_kwargs (#1427)

BREAKING CHANGE: Remove reasoning_config and apply_to_reasoning_traces
config options
  - Remove ReasoningModelConfig, ParsedTaskOutput, and related
extraction logic
  - Replace with automatic extraction from
response.additional_kwargs['reasoning_content']
  - Remove 1,085 lines of test code and related infrastructure
  - Simplify parse_task_output() to return str directly
  - Update DeepSeek-R1 and Nemotron example configs

Reasoning traces are now automatically extracted by LangChain and stored
in additional_kwargs, eliminating the need for token-based parsing and
complex configuration options (this is valid for nvidia_ai_endpoints and deepseek engines).
tgasser-nv pushed a commit that referenced this pull request Oct 14, 2025
…al_kwargs (#1427)

BREAKING CHANGE: Remove reasoning_config and apply_to_reasoning_traces
config options
  - Remove ReasoningModelConfig, ParsedTaskOutput, and related
extraction logic
  - Replace with automatic extraction from
response.additional_kwargs['reasoning_content']
  - Remove 1,085 lines of test code and related infrastructure
  - Simplify parse_task_output() to return str directly
  - Update DeepSeek-R1 and Nemotron example configs

Reasoning traces are now automatically extracted by LangChain and stored
in additional_kwargs, eliminating the need for token-based parsing and
complex configuration options (this is valid for nvidia_ai_endpoints and deepseek engines).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants